An information theoretic approach to hierarchical clustering combination

Authors

  • Elaheh Rashedi
  • Abdolreza Mirzaei
  • Mohammad Rahmati
Abstract

In hierarchical clustering, a set of patterns is partitioned into a sequence of groups represented as a dendrogram. The dendrogram is a tree in which each node is associated with the merging of two (or more) partitions, so each partition is nested in the next. Hierarchical representation has properties that are useful for visualizing and interpreting clustering results. On the one hand, different hierarchical clustering algorithms usually produce different dendrograms. On the other hand, clustering combination methods have received considerable interest in recent years and yield superior results for clustering problems. This paper proposes a framework for combining various hierarchical clustering results that preserves the structural contents of the input hierarchies. In this method, a description matrix is first created for each hierarchy, and the description matrices of the input hierarchies are then aggregated to form a consensus matrix from which the final hierarchy is derived. In this framework, we use two new measures for aggregating the description matrices, namely the Rényi and Jensen-Shannon divergences. The experimental and comparative analysis of the proposed framework shows the effectiveness of these two aggregators in hierarchical clustering combination.

Keywords: Clustering combination, dendrogram descriptor, divergence measure, hierarchical clustering.
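The three-stage pipeline in the abstract (describe each hierarchy as a matrix, aggregate the matrices into a consensus, derive a final hierarchy) can be sketched roughly as follows. This is an illustrative sketch only, not the authors' method. It assumes cophenetic distances as the dendrogram descriptor, and it swaps in a plain element-wise mean as the aggregator instead of the Rényi or Jensen-Shannon divergence aggregators that the paper proposes. The function name `combine_hierarchies` and the toy data are hypothetical.

```python
import numpy as np
from scipy.cluster.hierarchy import cophenet, fcluster, linkage
from scipy.spatial.distance import pdist, squareform


def combine_hierarchies(X, methods=("single", "complete", "average")):
    """Combine several dendrograms of X into one consensus dendrogram.

    Assumptions (not taken from the paper): the descriptor is the
    cophenetic distance matrix, and the aggregator is an element-wise mean.
    """
    base = pdist(X)  # condensed pairwise Euclidean distances
    descriptors = []
    for method in methods:
        Z = linkage(base, method=method)   # one input hierarchy
        d = cophenet(Z)                    # condensed cophenetic distances
        descriptors.append(d / d.max())    # rescale so hierarchies are comparable
    # Aggregate the description matrices into a consensus matrix.
    consensus = np.mean(descriptors, axis=0)
    # Derive the final hierarchy from the consensus matrix.
    Z_final = linkage(consensus, method="average")
    return squareform(consensus), Z_final


# Toy data: two well-separated groups of three points each.
rng = np.random.default_rng(0)
X = np.vstack([rng.normal(0, 0.1, (3, 2)), rng.normal(5, 0.1, (3, 2))])
consensus, Z_final = combine_hierarchies(X)
labels = fcluster(Z_final, t=2, criterion="maxclust")
```

Rescaling each descriptor by its maximum is one simple way to keep a single linkage method from dominating the mean. A divergence-based aggregator would replace the `np.mean` step.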

Similar resources

NGTSOM: A Novel Data Clustering Algorithm Based on Game Theoretic and Self-Organizing Map

Identifying clusters is an important aspect of data analysis. This paper proposes a novel data clustering algorithm to increase the clustering accuracy. A novel game theoretic self-organizing map (NGTSOM) and neural gas (NG) are used in combination with Competitive Hebbian Learning (CHL) to improve the quality of the map and provide a better vector quantization (VQ) for clustering data. Different ...

Combination of real options and game-theoretic approach in investment analysis

Technology investments account for a large amount of capital investment by major companies. Assessing such investment projects is critical to the efficient allocation of resources. Viewing investment projects as real options, this paper expands a method for assessing technology investment decisions in the joint presence of uncertainty and competition. It combines the game-theore...

A hierarchical clusterer ensemble method based on boosting theory

Bagging and boosting are two successful, well-known methods for developing classifier ensembles. It is recognized that clusterer ensemble methods that utilize the boosting concept can create clusterings with improved quality and robustness. In this paper, we introduce a new boosting-based hierarchical clusterer ensemble method called Bob-Hic. This method is utilized to create a consensu...

Information Theoretic Hierarchical Clustering

Hierarchical clustering has been extensively used in practice, where clusters can be assigned and analyzed simultaneously, especially when estimating the number of clusters is challenging. However, due to the conventional proximity measures recruited in these algorithms, they are only capable of detecting mass-shape clusters and encounter problems in identifying complex data structures. Here, w...

An information theoretic approach for analyzing temporal patterns of gene expression

MOTIVATION: Arrays allow measurements of the expression levels of thousands of mRNAs to be made simultaneously. The resulting data sets are information rich but require extensive mining to enhance their usefulness. Information theoretic methods are capable of assessing similarities and dissimilarities between data distributions and may be suited to the analysis of gene expression experiments. Th...

Journal:
  • Neurocomputing

Volume 148 (issue not listed)

Pages: not listed

Publication date: 2015